1 Blogposts, Tweets, and Forums

Note: In the surveyed papers we found various instances of blogposts, tweets, and forum posts. Many of these are timestamped posts that connect user mentions, textually derived entities, hashtags, or user actions on posts. We provide a selection of blogposts, tweets, and MOOC (a student forum) data as found in the literature.

Origin Notes: TimeArcs uses data from AMERICAblog, Huffington Post, and other sources (corpus_ner_geo), which can be found at https://github.com/CreativeCodingLab/TimeArcs/tree/master/Text/data.

Fast filtering uses the Twitter gardenhose streaming API to collect data on the 2013 Super Bowl and the announcement of Osama bin Laden’s death; both can be found at https://github.com/WICI/fastviz/tree/master/data (shared hashtags).

Event-based Dynamic Graph Drawing without the Agonizing Pain uses the rugby tweet dataset (pro12_mentions), consisting of mentions among the members of the Guinness Pro12 competition, and the MOOC dataset from http://moocdata.cn/challenges/kdd-cup-2015 (a data challenge).

Graph features handled: Dynamic, Hypergraphs, N-layers

Graph features in papers: dynamic, generic, dynamic, dynamic (continuous), large, dynamic, dynamic (discrete), layered graphs, n-layers, dynamic (discrete)

Origin Paper: Understanding Dropouts in MOOCs (https://www.notion.so/Understanding-Dropouts-in-MOOCs-a97966fe379c49c9a597ac30e7b838a1?pvs=21), TimeArcs: Visualizing Fluctuations in Dynamic Networks (https://www.notion.so/TimeArcs-Visualizing-Fluctuations-in-Dynamic-Networks-e671c46ecfa444efaf28392636402266?pvs=21), Fast filtering and animation of large dynamic networks (https://www.notion.so/Fast-filtering-and-animation-of-large-dynamic-networks-2f5aa0b43a394030865509b15a945847?pvs=21)

Originally found at: http://moocdata.cn/data/user-activity https://github.com/CreativeCodingLab/TimeArcs https://github.com/WICI/fastviz/tree/master/data

Size: 12-386412 nodes, 3151-556820 edges

Appeared in years: 2008, 2020, 2014, 2016, 2022

Type of Collection: Aggregate collection

is it stored properly?: No

must be analyzed: No

In repo?: Yes

Related to Literature - Algorithm (1) (Dataset tag relations): The Turing Test for Graph Drawing Algorithms (https://www.notion.so/The-Turing-Test-for-Graph-Drawing-Algorithms-2a183bdf85bf4ab0acb311d5b9615440?pvs=21), TimeArcs: Visualizing Fluctuations in Dynamic Networks (https://www.notion.so/TimeArcs-Visualizing-Fluctuations-in-Dynamic-Networks-968889d3ca4a4109aca698513515e837?pvs=21), Fast filtering and animation of large dynamic networks (https://www.notion.so/Fast-filtering-and-animation-of-large-dynamic-networks-a9ecbc1aa880473b834754638c54026b?pvs=21), Online Dynamic Graph Drawing (https://www.notion.so/Online-Dynamic-Graph-Drawing-ae22e4cc10ec451bb95c2ba6cfc35499?pvs=21)

cleaned format?: Yes

duplicate?: No

link works?: Yes

Added in paper: No

OSF link json: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d94add4cf7481071055619

Origin paper plaintext: Understanding Dropouts in MOOCs, TimeArcs: Visualizing Fluctuations in Dynamic Networks, Fast filtering and animation of large dynamic networks

Page id: e315a82238dc40c5a3559c81ef7c57c8

unavailable/skip: No

Cleaned ALL data: No

OSF link gexf: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d9497a0c2b4d0e8c386228

OSF link gml: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d96da194a6be112a12e740

OSF link graphml: https://files.osf.io/v1/resources/j7ucv/providers/osfstorage/64d971111101aa0ea36a0cb6

first look: Yes

sparkline data: {'min': 12, 'max': 386412, 'step_size': 200000, 'num_bins': 2, 'bins': [0, 200000], 'num_nodes': [5, 2]}

Related to Literature - Algorithm (Dataset tag relations) 1: Online Dynamic Graph Drawing, The Turing Test for Graph Drawing Algorithms, Fast filtering and animation of large dynamic networks, TimeArcs: Visualizing Fluctuations in Dynamic Networks, Event-based Dynamic Graph Drawing without the Agonizing Pain

2 Body

Statistics

four_in_one.svg

2.1 Blogs/MOOC

Descriptions from Literature

From TimeArcs: Visualizing Fluctuations in Dynamic Networks:

We collected 90,811 political blog posts over a ten-year period from 2005 to 2015 from seven different sources, including AMERICAblog, Huffington Post, and ProPublica. We then ran text analyses on these blogs and generated terms that were classified into four different categories. These terms were then input into TimeArcs.

From Event-based Dynamic Graph Drawing without the Agonizing Pain:

MOOC represents the actions (e.g. viewing a video, submitting an answer, etc.) taken by users on a popular massive open online class platform [KZL19]. The nodes represent users and course activities (targets), and temporal edges represent the actions by users on the targets. We pick and elaborate the first 15 thousands events.
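To make the structure described in this excerpt concrete, here is a minimal sketch of turning an event log into a bipartite temporal edge list and keeping only the first events. The record layout (`user`, `target`, `time`), the function name `build_temporal_graph`, and the assumption that "first" means earliest by timestamp are all illustrative choices, not taken from the paper or from the dataset files.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical record layout; the real MOOC files may use other field names.
Event = namedtuple("Event", ["user", "target", "time"])


def build_temporal_graph(events, limit=15_000):
    """Return (users, targets, edges) built from the earliest `limit` events.

    Nodes are tagged tuples so that a user id and a target id with the same
    string never collide. Each edge is (user_node, target_node, time).
    """
    ordered = sorted(events, key=lambda e: e.time)  # assumption: order by time
    users, targets, edges = set(), set(), []
    for ev in islice(ordered, limit):
        user_node = ("user", ev.user)
        target_node = ("target", ev.target)
        users.add(user_node)
        targets.add(target_node)
        edges.append((user_node, target_node, ev.time))
    return users, targets, edges


# Made-up sample events, for illustration only.
sample = [
    Event("u1", "video_3", 20.0),
    Event("u2", "quiz_1", 5.0),
    Event("u1", "quiz_1", 12.5),
]
users, targets, edges = build_temporal_graph(sample, limit=2)
```

With `limit=2`, only the two earliest sample events are kept, so the event at time 20.0 contributes neither an edge nor a target node.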

Example Figures

From TimeArcs: Visualizing Fluctuations in Dynamic Networks:


Fig. 7. Overview of political events in the past 10 years using TimeArcs. The top 100 terms were selected based on their sudden attention and degree centrality. Terms are color-coded by category: green for person, red for location, blue for organization, yellow for miscellaneous category.

From Event-based Dynamic Graph Drawing without the Agonizing Pain:


Table 2. Flattened snapshots of the network evolution over time taken at regular intervals. Twenty artificial time slices were inserted for temporal graphs

2.2 Tweets

Descriptions from Literature

From Fast filtering and animation of large dynamic networks:

We use data obtained through the Twitter gardenhose streaming API, which covers around 10% of the tweet volume. We focus on two events: the announcement of Osama bin Laden’s death and the 2013 Super Bowl. We consider user mentions and hashtags as entities and their co-occurrence in the same tweet as interactions between them.
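The entity and interaction definition in this excerpt can be sketched as follows. This is only an illustration under stated assumptions: the regular expression, the lowercasing of entities, the `(time, text)` input layout, and the function name `interaction_events` are hypothetical and are not taken from the fastviz code.

```python
import re
from itertools import combinations

# Assumption: an entity is "@" or "#" followed by word characters.
ENTITY = re.compile(r"[@#]\w+")


def interaction_events(tweets):
    """Turn (time, text) tweets into timestamped co-occurrence events.

    Every unordered pair of distinct entities found in the same tweet
    yields one (time, entity_a, entity_b) event, with the pair sorted.
    """
    events = []
    for time, text in tweets:
        entities = sorted({m.lower() for m in ENTITY.findall(text)})
        events.extend((time, a, b) for a, b in combinations(entities, 2))
    return events


# Made-up tweets, for illustration only.
sample_tweets = [
    (1, "Go #Ravens! @NFL #SuperBowl"),
    (2, "#SuperBowl halftime with @NFL"),
    (3, "no entities here"),
]
events = interaction_events(sample_tweets)
```

A tweet with fewer than two entities produces no interaction, and a pair that recurs in later tweets simply appears again with a later timestamp.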

From Event-based Dynamic Graph Drawing without the Agonizing Pain:

Rugby is a network derived from over 3000 tweets involving teams in the ‘Guinness Pro12’ rugby competition. The tweets were posted between 1 September 2014 and 23 October 2015. Each tweet contains information about the involved teams and the time of publication with a precision down to the second.
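Because the timestamps are precise to the second, tools that expect discrete time slices need the edges binned first. Below is a minimal sketch of equal-width binning; the edge layout `(datetime, team_a, team_b)`, the function name `slice_edges`, and the team labels are hypothetical, and this is not necessarily how any of the cited papers discretize the data.

```python
from datetime import datetime


def slice_edges(edges, num_slices):
    """Group (datetime, a, b) edges into `num_slices` equal-width time bins.

    Returns a list of lists of (a, b) pairs. The latest edge is clamped
    into the last bin instead of opening an extra one.
    """
    slices = [[] for _ in range(num_slices)]
    if not edges:
        return slices
    times = [t for t, _, _ in edges]
    start, end = min(times), max(times)
    span = (end - start).total_seconds()
    for t, a, b in edges:
        if span == 0:
            idx = 0  # all edges share one timestamp
        else:
            offset = (t - start).total_seconds()
            idx = min(int(offset / span * num_slices), num_slices - 1)
        slices[idx].append((a, b))
    return slices


# Made-up mention edges, for illustration only.
sample_edges = [
    (datetime(2014, 9, 1, 12, 0, 0), "team_a", "team_b"),
    (datetime(2014, 9, 1, 12, 0, 30), "team_b", "team_c"),
    (datetime(2015, 10, 23, 18, 0, 0), "team_a", "team_c"),
]
slices = slice_edges(sample_edges, num_slices=2)
```

Equal-width bins are the simplest choice; bins holding an equal number of events would be an alternative when activity is bursty.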

Example Figures

From Fast filtering and animation of large dynamic networks

Untitled

== STOP RENDERING ==

Online dynamic: cannot find the code online. The paper gives two links to where the data was collected, both broken (http://www.dailytech.com, http://www.rimzu.com). The paper says to look at https://www.computer.org/csdl/journal/tg/2008/04/ttg2008040727/13rRUxBJhvo, but that page is asking me to pay.

Turing Test: uses the Zachary Karate dataset (they link to the SuiteSparse Matrix Collection, but I also know it is in sparse and in one of the Pajek subcollections). It also uses a time slice of the same dataset used by Online dynamic (we could reconstruct this from their images, since it has only 85 nodes, but we would lose all the information we want).